DATA 607: Final Project

Author

Curtis Elsasser

Setup

library(kableExtra)
library(tidyverse)
source("wrangle.R")

Overview

todo: start with your proposal

What is MIDI?

MIDI stands for Musical Instrument Digital Interface. It is a protocol that allows electronic musical instruments, computers, and other devices to communicate with each other. MIDI data is a series of messages that tell a device what notes to play, how loud to play them, and when to play them. The protocol also includes messages that manipulate the playback instrument’s properties, creating effects such as pitch bend, modulation, and volume changes. MIDI files store this data so that it can be played back on different devices. They are not audio files and contain no sound; instead, they contain instructions for how to play a piece of music. This, in my opinion, is what makes them dreamy to work with: because the music is reduced to its most fundamental form, it is easy to manipulate and analyze.
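To make the idea of a "message" concrete, here is a minimal sketch of decoding a single three-byte MIDI channel message in base R. The function name `decode_midi_message` is hypothetical and not part of wrangle.R; the byte layout (message type in the upper four bits of the status byte, channel in the lower four) follows the MIDI 1.0 convention.

```r
# Decode a 3-byte MIDI channel message: status byte, then two data bytes.
decode_midi_message <- function(bytes) {
  stopifnot(length(bytes) == 3)
  bytes <- as.integer(bytes)
  status <- bitwAnd(bytes[1], 0xF0L)   # upper nibble: message type
  channel <- bitwAnd(bytes[1], 0x0FL)  # lower nibble: channel, 0 to 15
  kind <- if (status == 0x90L && bytes[3] > 0L) {
    "note_on"
  } else if (status == 0x80L || status == 0x90L) {
    "note_off"  # a note-on with velocity 0 is conventionally treated as a note-off
  } else {
    "other"     # for other message types the two data bytes mean something else
  }
  list(kind = kind, channel = channel, note = bytes[2], velocity = bytes[3])
}

# Middle C (note 60) struck with velocity 100 on the first channel.
msg <- decode_midi_message(c(0x90, 60, 100))
```

Here `msg$kind` is "note_on", `msg$note` is 60, and `msg$velocity` is 100: which note, and how hard. The "when" comes from the timing information stored alongside each message in the file.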

todo: add an excerpt from a midi file and include an excerpt of the same piece being performed by a human.

Wrangling

todo: talk about what wrangling we did in this project

Manifest

It’s not a ship’s log, so I don’t know if “manifest” is the best description of the file that manages metadata for this repository. The file is named metadata.csv, but that name is pretty generic and doesn’t distinguish it from all of the other metadata in this repository. I couldn’t think of a better short term for it, so I’m sticking with manifest. The manifest contains composer information, the composition title, the performer of each performance, the score of the composition, and more. We are primarily concerned with the following columns:

Column Description Type
composer Composer’s last name string
year_born Composer’s birth year (Wikipedia) integer
year_died Composer’s death year (Wikipedia) integer
year_midlife Midpoint of the composer’s birth and death years float
title Composition’s title string
performer The performer. Extracted from midi_performance string
midi_score The score. Relative path to the MIDI file string
midi_performance The performance. Relative path to the MIDI file string
csv_score The score CSV. Relative path to the CSV file string
csv_performance The performance CSV. Relative path to the CSV file string
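
The year_midlife values that appear in the tables of this document (for example 1717.5 for Bach) are consistent with the midpoint of the birth and death years. A minimal sketch of that derivation, assuming this is how wrangle.R computes it (its source is not shown here):

```r
# Birth and death years as they appear in the score table of this document.
composers <- data.frame(
  composer = c("Bach", "Brahms", "Chopin"),
  year_born = c(1685L, 1833L, 1810L),
  year_died = c(1750L, 1897L, 1849L)
)

# Assumed derivation: the midpoint of the composer's life.
composers$year_midlife <- (composers$year_born + composers$year_died) / 2
```

Because the midpoint can fall on a half year, the column holds fractional values rather than whole years.
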
Performance vs. Score

The performance and score are very closely related, but they are not the same. The score is the composition as written by the composer. The performance is the composition as played by the performer. Classical music is a very structured genre, but its performance is very expressive, and that expressiveness is difficult to capture in its entirety with metadata. Where the tempo, key signature, and time signature are meaningful in the score, they are meaningless in the performances in this repository. According to their metadata, the performances all look as if they were written in 4/4, in the key of C, and at 120 BPM. This is not the case. The performances are all unique and expressive. The score is the blueprint; the performance is the building.

The manifest is the key to the dataset. It tells us where to find the data and how to interpret it. And that is where we shall start.

Load

tbl_manifest <- load_manifest()
tbl_scores <- tbl_manifest$scores
tbl_perfs <- tbl_manifest$perfs

Let’s get an idea of what their insides look like.

Composers

tbl_scores |>
  nest_by(composer, year_born, year_died) |>
  arrange(year_born) |>
  mutate(composer = factor(composer)) |>
  ggplot(mapping = aes(x = year_born, y = composer, color = composer)) +
  geom_segment(mapping = aes(xend = year_died), linewidth = 6, show.legend = FALSE) +
  labs(
    title = "Composers and their Lifespans", 
    x = "Lifetime", 
    y = "Composer"
  )

Scores

What follows is a list of all 235 scores in the repository. It’s long, but I think it gives an excellent overview of the dataset, so I’m going to tuck it into a collapsible note. I shall determine its fate later.

tbl_scores |>
  select(id:title) |>
  kable()
id composer year_born year_died year_midlife title
1 Bach 1685 1750 1717.5 Fugue bwv 846
2 Bach 1685 1750 1717.5 Fugue bwv 848
3 Bach 1685 1750 1717.5 Fugue bwv 854
4 Bach 1685 1750 1717.5 Fugue bwv 856
5 Bach 1685 1750 1717.5 Fugue bwv 857
6 Bach 1685 1750 1717.5 Fugue bwv 858
7 Bach 1685 1750 1717.5 Fugue bwv 860
8 Bach 1685 1750 1717.5 Fugue bwv 862
9 Bach 1685 1750 1717.5 Fugue bwv 863
10 Bach 1685 1750 1717.5 Fugue bwv 864
11 Bach 1685 1750 1717.5 Fugue bwv 865
12 Bach 1685 1750 1717.5 Fugue bwv 866
13 Bach 1685 1750 1717.5 Fugue bwv 867
14 Bach 1685 1750 1717.5 Fugue bwv 868
15 Bach 1685 1750 1717.5 Fugue bwv 870
16 Bach 1685 1750 1717.5 Fugue bwv 873
17 Bach 1685 1750 1717.5 Fugue bwv 874
18 Bach 1685 1750 1717.5 Fugue bwv 875
19 Bach 1685 1750 1717.5 Fugue bwv 876
20 Bach 1685 1750 1717.5 Fugue bwv 880
21 Bach 1685 1750 1717.5 Fugue bwv 883
22 Bach 1685 1750 1717.5 Fugue bwv 884
23 Bach 1685 1750 1717.5 Fugue bwv 885
24 Bach 1685 1750 1717.5 Fugue bwv 887
25 Bach 1685 1750 1717.5 Fugue bwv 888
26 Bach 1685 1750 1717.5 Fugue bwv 889
27 Bach 1685 1750 1717.5 Fugue bwv 891
28 Bach 1685 1750 1717.5 Fugue bwv 892
29 Bach 1685 1750 1717.5 Fugue bwv 893
30 Bach 1685 1750 1717.5 Italian concerto
31 Bach 1685 1750 1717.5 Prelude bwv 846
32 Bach 1685 1750 1717.5 Prelude bwv 848
33 Bach 1685 1750 1717.5 Prelude bwv 854
34 Bach 1685 1750 1717.5 Prelude bwv 856
35 Bach 1685 1750 1717.5 Prelude bwv 857
36 Bach 1685 1750 1717.5 Prelude bwv 858
37 Bach 1685 1750 1717.5 Prelude bwv 860
38 Bach 1685 1750 1717.5 Prelude bwv 862
39 Bach 1685 1750 1717.5 Prelude bwv 863
40 Bach 1685 1750 1717.5 Prelude bwv 864
41 Bach 1685 1750 1717.5 Prelude bwv 865
42 Bach 1685 1750 1717.5 Prelude bwv 866
43 Bach 1685 1750 1717.5 Prelude bwv 867
44 Bach 1685 1750 1717.5 Prelude bwv 868
45 Bach 1685 1750 1717.5 Prelude bwv 870
46 Bach 1685 1750 1717.5 Prelude bwv 873
47 Bach 1685 1750 1717.5 Prelude bwv 874
48 Bach 1685 1750 1717.5 Prelude bwv 875
49 Bach 1685 1750 1717.5 Prelude bwv 876
50 Bach 1685 1750 1717.5 Prelude bwv 880
51 Bach 1685 1750 1717.5 Prelude bwv 883
52 Bach 1685 1750 1717.5 Prelude bwv 884
53 Bach 1685 1750 1717.5 Prelude bwv 885
54 Bach 1685 1750 1717.5 Prelude bwv 887
55 Bach 1685 1750 1717.5 Prelude bwv 888
56 Bach 1685 1750 1717.5 Prelude bwv 889
57 Bach 1685 1750 1717.5 Prelude bwv 891
58 Bach 1685 1750 1717.5 Prelude bwv 892
59 Bach 1685 1750 1717.5 Prelude bwv 893
60 Balakirev 1837 1910 1873.5 Islamey
61 Beethoven 1770 1827 1798.5 Piano Sonatas 1-1
62 Beethoven 1770 1827 1798.5 Piano Sonatas 10-1
63 Beethoven 1770 1827 1798.5 Piano Sonatas 11-1
64 Beethoven 1770 1827 1798.5 Piano Sonatas 11-2
65 Beethoven 1770 1827 1798.5 Piano Sonatas 11-3
66 Beethoven 1770 1827 1798.5 Piano Sonatas 12-1
67 Beethoven 1770 1827 1798.5 Piano Sonatas 13-4
68 Beethoven 1770 1827 1798.5 Piano Sonatas 14-3
69 Beethoven 1770 1827 1798.5 Piano Sonatas 15-1
70 Beethoven 1770 1827 1798.5 Piano Sonatas 15-4
71 Beethoven 1770 1827 1798.5 Piano Sonatas 16-1
72 Beethoven 1770 1827 1798.5 Piano Sonatas 16-2
73 Beethoven 1770 1827 1798.5 Piano Sonatas 17-1
74 Beethoven 1770 1827 1798.5 Piano Sonatas 17-1
75 Beethoven 1770 1827 1798.5 Piano Sonatas 17-2
76 Beethoven 1770 1827 1798.5 Piano Sonatas 17-3
77 Beethoven 1770 1827 1798.5 Piano Sonatas 18-1
78 Beethoven 1770 1827 1798.5 Piano Sonatas 18-2
79 Beethoven 1770 1827 1798.5 Piano Sonatas 18-2
80 Beethoven 1770 1827 1798.5 Piano Sonatas 18-3
81 Beethoven 1770 1827 1798.5 Piano Sonatas 18-4
82 Beethoven 1770 1827 1798.5 Piano Sonatas 2-1
83 Beethoven 1770 1827 1798.5 Piano Sonatas 21-1
84 Beethoven 1770 1827 1798.5 Piano Sonatas 21-1
85 Beethoven 1770 1827 1798.5 Piano Sonatas 21-2
86 Beethoven 1770 1827 1798.5 Piano Sonatas 21-3
87 Beethoven 1770 1827 1798.5 Piano Sonatas 22-1
88 Beethoven 1770 1827 1798.5 Piano Sonatas 22-2
89 Beethoven 1770 1827 1798.5 Piano Sonatas 23-1
90 Beethoven 1770 1827 1798.5 Piano Sonatas 24-1
91 Beethoven 1770 1827 1798.5 Piano Sonatas 24-1
92 Beethoven 1770 1827 1798.5 Piano Sonatas 24-1
93 Beethoven 1770 1827 1798.5 Piano Sonatas 24-2
94 Beethoven 1770 1827 1798.5 Piano Sonatas 26-1
95 Beethoven 1770 1827 1798.5 Piano Sonatas 26-1
96 Beethoven 1770 1827 1798.5 Piano Sonatas 26-2
97 Beethoven 1770 1827 1798.5 Piano Sonatas 26-3
98 Beethoven 1770 1827 1798.5 Piano Sonatas 27-1
99 Beethoven 1770 1827 1798.5 Piano Sonatas 27-2
100 Beethoven 1770 1827 1798.5 Piano Sonatas 28-1
101 Beethoven 1770 1827 1798.5 Piano Sonatas 28-2
102 Beethoven 1770 1827 1798.5 Piano Sonatas 29-2
103 Beethoven 1770 1827 1798.5 Piano Sonatas 29-3
104 Beethoven 1770 1827 1798.5 Piano Sonatas 29-4
105 Beethoven 1770 1827 1798.5 Piano Sonatas 3-1
106 Beethoven 1770 1827 1798.5 Piano Sonatas 3-2
107 Beethoven 1770 1827 1798.5 Piano Sonatas 30-1
108 Beethoven 1770 1827 1798.5 Piano Sonatas 31-1
109 Beethoven 1770 1827 1798.5 Piano Sonatas 31-2
110 Beethoven 1770 1827 1798.5 Piano Sonatas 31-3 4
111 Beethoven 1770 1827 1798.5 Piano Sonatas 32-1
112 Beethoven 1770 1827 1798.5 Piano Sonatas 4-1
113 Beethoven 1770 1827 1798.5 Piano Sonatas 5-1
114 Beethoven 1770 1827 1798.5 Piano Sonatas 7-1
115 Beethoven 1770 1827 1798.5 Piano Sonatas 7-2
116 Beethoven 1770 1827 1798.5 Piano Sonatas 7-3
117 Beethoven 1770 1827 1798.5 Piano Sonatas 7-4
118 Beethoven 1770 1827 1798.5 Piano Sonatas 8-1
119 Beethoven 1770 1827 1798.5 Piano Sonatas 8-2
120 Beethoven 1770 1827 1798.5 Piano Sonatas 8-3
121 Beethoven 1770 1827 1798.5 Piano Sonatas 9-1
122 Beethoven 1770 1827 1798.5 Piano Sonatas 9-2
123 Beethoven 1770 1827 1798.5 Piano Sonatas 9-3
124 Brahms 1833 1897 1865.0 Six Pieces op 118 2
125 Chopin 1810 1849 1829.5 Ballades 1
126 Chopin 1810 1849 1829.5 Ballades 2
127 Chopin 1810 1849 1829.5 Ballades 3
128 Chopin 1810 1849 1829.5 Ballades 4
129 Chopin 1810 1849 1829.5 Barcarolle
130 Chopin 1810 1849 1829.5 Berceuse op 57
131 Chopin 1810 1849 1829.5 Etudes op 10 1
132 Chopin 1810 1849 1829.5 Etudes op 10 10
133 Chopin 1810 1849 1829.5 Etudes op 10 12
134 Chopin 1810 1849 1829.5 Etudes op 10 2
135 Chopin 1810 1849 1829.5 Etudes op 10 3
136 Chopin 1810 1849 1829.5 Etudes op 10 4
137 Chopin 1810 1849 1829.5 Etudes op 10 5
138 Chopin 1810 1849 1829.5 Etudes op 10 7
139 Chopin 1810 1849 1829.5 Etudes op 10 8
140 Chopin 1810 1849 1829.5 Etudes op 25 1
141 Chopin 1810 1849 1829.5 Etudes op 25 10
142 Chopin 1810 1849 1829.5 Etudes op 25 11
143 Chopin 1810 1849 1829.5 Etudes op 25 12
144 Chopin 1810 1849 1829.5 Etudes op 25 2
145 Chopin 1810 1849 1829.5 Etudes op 25 4
146 Chopin 1810 1849 1829.5 Etudes op 25 5
147 Chopin 1810 1849 1829.5 Etudes op 25 8
148 Chopin 1810 1849 1829.5 Polonaises 53
149 Chopin 1810 1849 1829.5 Scherzos 20
150 Chopin 1810 1849 1829.5 Scherzos 31
151 Chopin 1810 1849 1829.5 Scherzos 39
152 Chopin 1810 1849 1829.5 Sonata 2 1st
153 Chopin 1810 1849 1829.5 Sonata 2 2nd
154 Chopin 1810 1849 1829.5 Sonata 2 2nd
155 Chopin 1810 1849 1829.5 Sonata 2 3rd
156 Chopin 1810 1849 1829.5 Sonata 2 3rd
157 Chopin 1810 1849 1829.5 Sonata 2 4th
158 Chopin 1810 1849 1829.5 Sonata 3 2nd
159 Chopin 1810 1849 1829.5 Sonata 3 3rd
160 Chopin 1810 1849 1829.5 Sonata 3 4th
161 Debussy 1862 1918 1890.0 Images Book 1 1 Reflets dans lEau
162 Debussy 1862 1918 1890.0 Pour le Piano 1
163 Glinka 1804 1857 1830.5 The Lark
164 Haydn 1732 1809 1770.5 Keyboard Sonatas 31-1
165 Haydn 1732 1809 1770.5 Keyboard Sonatas 32-1
166 Haydn 1732 1809 1770.5 Keyboard Sonatas 32-1
167 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-1
168 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-2
169 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-3
170 Haydn 1732 1809 1770.5 Keyboard Sonatas 46-1
171 Haydn 1732 1809 1770.5 Keyboard Sonatas 48-1
172 Haydn 1732 1809 1770.5 Keyboard Sonatas 48-2
173 Haydn 1732 1809 1770.5 Keyboard Sonatas 49-1
174 Haydn 1732 1809 1770.5 Keyboard Sonatas 50-1
175 Haydn 1732 1809 1770.5 Keyboard Sonatas 6-1
176 Liszt 1811 1886 1848.5 Annees de pelerinage 2 1 Gondoliera
177 Liszt 1811 1886 1848.5 Ballade 2
178 Liszt 1811 1886 1848.5 Concert Etude S145 1
179 Liszt 1811 1886 1848.5 Concert Etude S145 2
180 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 2 La campanella
181 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 6 Theme and Variations
182 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 6 Theme and Variations
183 Liszt 1811 1886 1848.5 Hungarian Rhapsodies 6
184 Liszt 1811 1886 1848.5 Mephisto Waltz
185 Liszt 1811 1886 1848.5 Sonata
186 Liszt 1811 1886 1848.5 Transcendental Etudes 1
187 Liszt 1811 1886 1848.5 Transcendental Etudes 10
188 Liszt 1811 1886 1848.5 Transcendental Etudes 11
189 Liszt 1811 1886 1848.5 Transcendental Etudes 3
190 Liszt 1811 1886 1848.5 Transcendental Etudes 4
191 Liszt 1811 1886 1848.5 Transcendental Etudes 5
192 Liszt 1811 1886 1848.5 Transcendental Etudes 9
193 Mozart 1756 1791 1773.5 Fantasie 475
194 Mozart 1756 1791 1773.5 Piano Sonatas 11-3
195 Mozart 1756 1791 1773.5 Piano Sonatas 12-1
196 Mozart 1756 1791 1773.5 Piano Sonatas 12-2
197 Mozart 1756 1791 1773.5 Piano Sonatas 12-3
198 Mozart 1756 1791 1773.5 Piano Sonatas 8-1
199 Prokofiev 1891 1953 1922.0 Toccata
200 Rachmaninoff 1873 1943 1908.0 Preludes op 23 4
201 Rachmaninoff 1873 1943 1908.0 Preludes op 23 6
202 Rachmaninoff 1873 1943 1908.0 Preludes op 32 10
203 Rachmaninoff 1873 1943 1908.0 Preludes op 32 5
204 Ravel 1875 1937 1906.0 Gaspard de la Nuit 1 Ondine
205 Ravel 1875 1937 1906.0 Miroirs 3 Une Barque
206 Ravel 1875 1937 1906.0 Miroirs 4 Alborada del gracioso
207 Ravel 1875 1937 1906.0 Pavane
208 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 1
209 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 2
210 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 3
211 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 4
212 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 4
213 Schubert 1797 1828 1812.5 Impromptu op142 1
214 Schubert 1797 1828 1812.5 Impromptu op142 3
215 Schubert 1797 1828 1812.5 Moment Musical no 1
216 Schubert 1797 1828 1812.5 Moment musical no 3
217 Schubert 1797 1828 1812.5 Piano Sonatas 664-1
218 Schubert 1797 1828 1812.5 Piano Sonatas 664-2
219 Schubert 1797 1828 1812.5 Piano Sonatas 664-3
220 Schubert 1797 1828 1812.5 Piano Sonatas 894-2
221 Schubert 1797 1828 1812.5 Piano Sonatas 894-2
222 Schubert 1797 1828 1812.5 Wanderer fantasie
223 Schumann 1810 1856 1833.0 Arabeske
224 Schumann 1810 1856 1833.0 Kreisleriana 1
225 Schumann 1810 1856 1833.0 Kreisleriana 1
226 Schumann 1810 1856 1833.0 Kreisleriana 2
227 Schumann 1810 1856 1833.0 Kreisleriana 3
228 Schumann 1810 1856 1833.0 Kreisleriana 4
229 Schumann 1810 1856 1833.0 Kreisleriana 5
230 Schumann 1810 1856 1833.0 Kreisleriana 6
231 Schumann 1810 1856 1833.0 Kreisleriana 7
232 Schumann 1810 1856 1833.0 Toccata
233 Schumann 1810 1856 1833.0 Toccata repeat
234 Scriabin 1872 1915 1893.5 Etudes op 8 11
235 Scriabin 1872 1915 1893.5 Sonatas 5
Performers

Performances by performer, from most recordings to least, limited to performers with at least 10 recordings.

tbl_top_performers <- tbl_perfs |>
  group_by(performer) |>
  summarise(n = n()) |>
  arrange(desc(n)) |>
  filter(n >= 10) |>
  left_join(tbl_perfs, by = "performer") |>
  arrange(desc(n))

tbl_top_performers |>
  select(performer:title) |>
  kable()
performer n id composer year_born year_died year_midlife title
Huang 17 67 Bach 1685 1750 1717.5 Fugue bwv 885
Huang 17 151 Bach 1685 1750 1717.5 Prelude bwv 885
Huang 17 318 Beethoven 1770 1827 1798.5 Piano Sonatas 26-1
Huang 17 324 Beethoven 1770 1827 1798.5 Piano Sonatas 26-2
Huang 17 329 Beethoven 1770 1827 1798.5 Piano Sonatas 26-3
Huang 17 376 Beethoven 1770 1827 1798.5 Piano Sonatas 31-1
Huang 17 643 Chopin 1810 1849 1829.5 Etudes op 25 11
Huang 17 779 Haydn 1732 1809 1770.5 Keyboard Sonatas 50-1
Huang 17 780 Haydn 1732 1809 1770.5 Keyboard Sonatas 50-1
Huang 17 863 Liszt 1811 1886 1848.5 Sonata
Huang 17 866 Liszt 1811 1886 1848.5 Transcendental Etudes 1
Huang 17 871 Liszt 1811 1886 1848.5 Transcendental Etudes 10
Huang 17 884 Liszt 1811 1886 1848.5 Transcendental Etudes 11
Huang 17 885 Liszt 1811 1886 1848.5 Transcendental Etudes 3
Huang 17 890 Liszt 1811 1886 1848.5 Transcendental Etudes 4
Huang 17 899 Liszt 1811 1886 1848.5 Transcendental Etudes 5
Huang 17 989 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 4
Na 15 174 Balakirev 1837 1910 1873.5 Islamey
Na 15 175 Balakirev 1837 1910 1873.5 Islamey
Na 15 296 Beethoven 1770 1827 1798.5 Piano Sonatas 23-1
Na 15 382 Beethoven 1770 1827 1798.5 Piano Sonatas 31-1
Na 15 394 Beethoven 1770 1827 1798.5 Piano Sonatas 31-2
Na 15 404 Beethoven 1770 1827 1798.5 Piano Sonatas 31-3 4
Na 15 445 Beethoven 1770 1827 1798.5 Piano Sonatas 8-1
Na 15 446 Beethoven 1770 1827 1798.5 Piano Sonatas 8-2
Na 15 447 Beethoven 1770 1827 1798.5 Piano Sonatas 8-3
Na 15 492 Chopin 1810 1849 1829.5 Barcarolle
Na 15 493 Chopin 1810 1849 1829.5 Barcarolle
Na 15 529 Chopin 1810 1849 1829.5 Etudes op 10 10
Na 15 541 Chopin 1810 1849 1829.5 Etudes op 10 12
Na 15 1063 Scriabin 1872 1915 1893.5 Sonatas 5
Na 15 1064 Scriabin 1872 1915 1893.5 Sonatas 5
Lee 13 3 Bach 1685 1750 1717.5 Fugue bwv 848
Lee 13 90 Bach 1685 1750 1717.5 Prelude bwv 848
Lee 13 196 Beethoven 1770 1827 1798.5 Piano Sonatas 15-1
Lee 13 217 Beethoven 1770 1827 1798.5 Piano Sonatas 17-1
Lee 13 319 Beethoven 1770 1827 1798.5 Piano Sonatas 26-1
Lee 13 325 Beethoven 1770 1827 1798.5 Piano Sonatas 26-2
Lee 13 330 Beethoven 1770 1827 1798.5 Piano Sonatas 26-3
Lee 13 433 Beethoven 1770 1827 1798.5 Piano Sonatas 7-1
Lee 13 769 Haydn 1732 1809 1770.5 Keyboard Sonatas 49-1
Lee 13 924 Mozart 1756 1791 1773.5 Piano Sonatas 8-1
Lee 13 980 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 3
Lee 13 1016 Schubert 1797 1828 1812.5 Piano Sonatas 894-2
Lee 13 1022 Schubert 1797 1828 1812.5 Piano Sonatas 894-2
Lin 12 5 Bach 1685 1750 1717.5 Fugue bwv 848
Lin 12 92 Bach 1685 1750 1717.5 Prelude bwv 848
Lin 12 356 Beethoven 1770 1827 1798.5 Piano Sonatas 3-1
Lin 12 357 Beethoven 1770 1827 1798.5 Piano Sonatas 3-1
Lin 12 823 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 2 La campanella
Lin 12 824 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 2 La campanella
Lin 12 932 Prokofiev 1891 1953 1922.0 Toccata
Lin 12 990 Schubert 1797 1828 1812.5 Impromptu op.90 D.899 4
Lin 12 998 Schubert 1797 1828 1812.5 Impromptu op142 3
Lin 12 1009 Schubert 1797 1828 1812.5 Piano Sonatas 664-1
Lin 12 1012 Schubert 1797 1828 1812.5 Piano Sonatas 664-2
Lin 12 1014 Schubert 1797 1828 1812.5 Piano Sonatas 664-3
Yarden 12 544 Chopin 1810 1849 1829.5 Etudes op 10 12
Yarden 12 755 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-1
Yarden 12 756 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-1
Yarden 12 757 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-2
Yarden 12 758 Haydn 1732 1809 1770.5 Keyboard Sonatas 39-3
Yarden 12 1030 Schumann 1810 1856 1833.0 Kreisleriana 1
Yarden 12 1034 Schumann 1810 1856 1833.0 Kreisleriana 2
Yarden 12 1037 Schumann 1810 1856 1833.0 Kreisleriana 3
Yarden 12 1039 Schumann 1810 1856 1833.0 Kreisleriana 4
Yarden 12 1042 Schumann 1810 1856 1833.0 Kreisleriana 5
Yarden 12 1045 Schumann 1810 1856 1833.0 Kreisleriana 6
Yarden 12 1048 Schumann 1810 1856 1833.0 Kreisleriana 7
Lisiecki 11 19 Bach 1685 1750 1717.5 Fugue bwv 857
Lisiecki 11 46 Bach 1685 1750 1717.5 Fugue bwv 873
Lisiecki 11 47 Bach 1685 1750 1717.5 Fugue bwv 873
Lisiecki 11 106 Bach 1685 1750 1717.5 Prelude bwv 857
Lisiecki 11 133 Bach 1685 1750 1717.5 Prelude bwv 873
Lisiecki 11 134 Bach 1685 1750 1717.5 Prelude bwv 873
Lisiecki 11 303 Beethoven 1770 1827 1798.5 Piano Sonatas 24-1
Lisiecki 11 304 Beethoven 1770 1827 1798.5 Piano Sonatas 24-1
Lisiecki 11 311 Beethoven 1770 1827 1798.5 Piano Sonatas 24-2
Lisiecki 11 650 Chopin 1810 1849 1829.5 Etudes op 25 11
Lisiecki 11 995 Schubert 1797 1828 1812.5 Impromptu op142 1
MiyashitaM 11 7 Bach 1685 1750 1717.5 Fugue bwv 848
MiyashitaM 11 12 Bach 1685 1750 1717.5 Fugue bwv 854
MiyashitaM 11 94 Bach 1685 1750 1717.5 Prelude bwv 848
MiyashitaM 11 99 Bach 1685 1750 1717.5 Prelude bwv 854
MiyashitaM 11 338 Beethoven 1770 1827 1798.5 Piano Sonatas 27-1
MiyashitaM 11 358 Beethoven 1770 1827 1798.5 Piano Sonatas 3-1
MiyashitaM 11 360 Beethoven 1770 1827 1798.5 Piano Sonatas 3-2
MiyashitaM 11 484 Chopin 1810 1849 1829.5 Ballades 4
MiyashitaM 11 500 Chopin 1810 1849 1829.5 Berceuse op 57
MiyashitaM 11 610 Chopin 1810 1849 1829.5 Etudes op 10 8
MiyashitaM 11 651 Chopin 1810 1849 1829.5 Etudes op 25 11
WangA 11 15 Bach 1685 1750 1717.5 Fugue bwv 854
WangA 11 21 Bach 1685 1750 1717.5 Fugue bwv 857
WangA 11 57 Bach 1685 1750 1717.5 Fugue bwv 880
WangA 11 102 Bach 1685 1750 1717.5 Prelude bwv 854
WangA 11 108 Bach 1685 1750 1717.5 Prelude bwv 857
WangA 11 144 Bach 1685 1750 1717.5 Prelude bwv 880
WangA 11 271 Beethoven 1770 1827 1798.5 Piano Sonatas 21-1
WangA 11 636 Chopin 1810 1849 1829.5 Etudes op 25 10
WangA 11 661 Chopin 1810 1849 1829.5 Etudes op 25 11
WangA 11 839 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 6 Theme and Variations
WangA 11 840 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 6 Theme and Variations
MunA 10 75 Bach 1685 1750 1717.5 Fugue bwv 889
MunA 10 159 Bach 1685 1750 1717.5 Prelude bwv 889
MunA 10 463 Chopin 1810 1849 1829.5 Ballades 1
MunA 10 515 Chopin 1810 1849 1829.5 Etudes op 10 1
MunA 10 743 Debussy 1862 1918 1890.0 Pour le Piano 1
MunA 10 827 Liszt 1811 1886 1848.5 Gran Etudes de Paganini 2 La campanella
MunA 10 914 Mozart 1756 1791 1773.5 Piano Sonatas 12-1
MunA 10 917 Mozart 1756 1791 1773.5 Piano Sonatas 12-2
MunA 10 920 Mozart 1756 1791 1773.5 Piano Sonatas 12-3
MunA 10 1005 Schubert 1797 1828 1812.5 Moment Musical no 1

Performances

There are a total of 1067 performances. That list is too long even for a collapsible note, so I’m not going to print it. Rather, let’s zoom in on a single composition and see who has performed “Fugue bwv 846” by Bach.

tbl_perfs |>
  filter(title == "Fugue bwv 846") |>
  select(id, composer, title, performer) |>
  kable()
id composer title performer
1 Bach Fugue bwv 846 Shi

There is only one performer for “Fugue bwv 846” in the dataset, so let’s go with a juicier one: “Fugue bwv 848”.

tbl_perfs |>
  filter(title == "Fugue bwv 848") |>
  select(id, composer, title, performer) |>
  kable()
id composer title performer
2 Bach Fugue bwv 848 Denisova
3 Bach Fugue bwv 848 Lee
4 Bach Fugue bwv 848 LeeSH
5 Bach Fugue bwv 848 Lin
6 Bach Fugue bwv 848 Lou
7 Bach Fugue bwv 848 MiyashitaM
8 Bach Fugue bwv 848 Mizumoto
9 Bach Fugue bwv 848 SunY
10 Bach Fugue bwv 848 Zhou

Here ends our brief glimpse into the dataset. We shall now delve into the music data itself.

Music

Music is stored in various ways in the ASAP repository. The format that we have cultivated, and the one we are by far the most interested in as data scientists, is CSV. What follows is the schema of our internal representation of a composition:

Column Description Type
id The composition ID integer
composer The composer of the piece string
year_born The composer’s birth year (Wikipedia) integer
year_died The composer’s death year (Wikipedia) integer
year_written Approximate year of composition: the midpoint of the composer’s life, so accurate to within half a lifetime float
title The composition’s title string
performer The performer. Extracted from midi_performance. NA for scores string|NA
type The type of data. Music is all note string
time_offset The number of seconds from the beginning float
time_duration The duration in seconds float
tick_offset The number of MIDI ticks from the beginning integer
tick_duration The duration in MIDI ticks integer
note_midi The MIDI value of the note integer
note_normal The MIDI value normalized, [0, 1] float
velocity The velocity of the note, normalized, [0, 1] float
pretty The named representation of the note. Matches key-signature’s spelling string
canonical The canonical representation of the note. We always use the flat equivalent string
density How dense notes are in the vicinity of this note. float
interval The interval between this note and the following note string
tempo The tempo of the current point in the piece integer
key_signature The key signature at this point in the piece string
time_signature The time signature at this point in the piece string
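
The relationship between the tick columns, the time columns, and the normalized columns can be checked against the sample rows printed under Load. The sketch below uses two hypothetical helpers, `ticks_to_seconds` and `normalize_midi`; they are assumptions about how such values could be derived, not the code in wrangle.R, although they do reproduce the printed numbers.

```r
# A quarter note lasts 60 / tempo seconds and spans ticks_per_quarter ticks.
ticks_to_seconds <- function(ticks, tempo_bpm, ticks_per_quarter) {
  ticks * 60 / (tempo_bpm * ticks_per_quarter)
}

# MIDI data bytes range from 0 to 127, so dividing by 127 maps them onto [0, 1].
normalize_midi <- function(value) {
  value / 127
}

# First score row: tick_duration 239 at 120 BPM with 480 ticks per quarter.
score_seconds <- ticks_to_seconds(239, tempo_bpm = 120, ticks_per_quarter = 480)

# First performance row: tick_duration 651 at 120 BPM with 384 ticks per quarter.
perf_seconds <- ticks_to_seconds(651, tempo_bpm = 120, ticks_per_quarter = 384)

# note_midi 60 (C4) on the normalized scale.
c4_normal <- normalize_midi(60)
```

These give roughly 0.2489583, 0.8476562, and 0.4724409, matching the time_duration and note_normal values in those rows. The performance durations are derived from the nominal 120 BPM in the file, not from the tempo the performer actually played at.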
ticks_per_quarter The number of ticks in a quarter note integer

Load

tbl_perfs_music <- load_music(tbl_perfs) |>
  bind_rows()
tbl_perfs_music |>
  head() |>
  kable()
id composer year_born year_died year_written title performer type time_offset time_duration tick_offset tick_duration note_midi note_normal velocity pretty canonical density interval tempo time_signature key_signature ticks_per_quarter
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 0.0000000 0.8476562 384 651 60 0.4724409 0.2834646 C4 C4 0.0510 M2 120 4/4 C Major 384
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 0.5950521 0.7343750 841 564 62 0.4881890 0.4015748 D4 D4 0.0510 M2 120 4/4 C Major 384
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 1.2057292 0.7486979 1310 575 64 0.5039370 0.4251969 E4 E4 0.0510 m2 120 4/4 C Major 384
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 1.8645833 1.0117188 1816 777 65 0.5118110 0.4488189 F4 F4 0.0510 M2 120 4/4 C Major 384
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 2.8867188 0.4427083 2601 340 67 0.5275591 0.3464567 G4 G4 0.1531 M2 120 4/4 C Major 384
1 Bach 1685 1750 1717.5 Fugue bwv 846 Shi note 3.0273438 0.3385417 2709 260 65 0.5118110 0.4251969 F4 F4 0.1021 m2 120 4/4 C Major 384
tbl_scores_music <- load_music(tbl_scores) |>
  bind_rows()
tbl_scores_music |>
  head() |>
  kable()
id composer year_born year_died year_written title performer type time_offset time_duration tick_offset tick_duration note_midi note_normal velocity pretty canonical density interval tempo time_signature key_signature ticks_per_quarter
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.0000 0.2489583 240 239 60 0.4724409 0.6299213 C4 C4 0.1369 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.2500 0.2489583 480 239 62 0.4881890 0.6299213 D4 D4 0.1369 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.5000 0.2489583 720 239 64 0.5039370 0.6299213 E4 E4 0.0913 m2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.7500 0.3739583 960 359 65 0.5118110 0.6299213 F4 F4 0.1826 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 1.1250 0.0614583 1320 59 67 0.5275591 0.6299213 G4 G4 0.1826 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 1.1875 0.0614583 1380 59 65 0.5118110 0.6299213 F4 F4 0.1369 m2 120 4/4 C Major 480
tbl_fugue_bwv_846 <- load_music_by_title(tbl_scores, "Fugue bwv 846")[[1]]
tbl_fugue_bwv_846 |>
  head() |>
  kable()
id composer year_born year_died year_written title performer type time_offset time_duration tick_offset tick_duration note_midi note_normal velocity pretty canonical density interval tempo time_signature key_signature ticks_per_quarter
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.0000 0.2489583 240 239 60 0.4724409 0.6299213 C4 C4 0.1369 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.2500 0.2489583 480 239 62 0.4881890 0.6299213 D4 D4 0.1369 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.5000 0.2489583 720 239 64 0.5039370 0.6299213 E4 E4 0.0913 m2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 0.7500 0.3739583 960 359 65 0.5118110 0.6299213 F4 F4 0.1826 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 1.1250 0.0614583 1320 59 67 0.5275591 0.6299213 G4 G4 0.1826 M2 120 4/4 C Major 480
1 Bach 1685 1750 1717.5 Fugue bwv 846 NA note 1.1875 0.0614583 1380 59 65 0.5118110 0.6299213 F4 F4 0.1369 m2 120 4/4 C Major 480
tbl_fugue_bwv_848 <- load_music_by_title(tbl_perfs, "Fugue bwv 848") |>
  bind_rows()

Visualization

tbl_fugue_bwv_846 |>
  group_by(canonical) |>
  summarise(n = n()) |>
  arrange(desc(n)) |>
  mutate(
    canonical = factor(canonical, levels = canonical)
  ) |>
  ggplot(mapping = aes(x = canonical, y = n)) +
  geom_bar(stat = "identity") +
  labs(
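    # every row repeats the same metadata, so take the first value to build a single title string
    title = str_c(
      "Frequency of Notes in ",
      first(tbl_fugue_bwv_846$title),
      ", ",
      first(tbl_fugue_bwv_846$key_signature),
      ", ",
      first(tbl_fugue_bwv_846$composer)
    ),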
    x = "Note",
    y = "Frequency"
  )
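
The canonical column counted above spells every note with its flat equivalent. A minimal sketch of how a MIDI note number could be mapped to such a name follows; `midi_to_canonical` is a hypothetical helper, and the "b" suffix for flats is an assumption, since the dataset's exact spelling of accidentals is not shown in this document.

```r
# Map MIDI note numbers to names, always choosing the flat spelling.
midi_to_canonical <- function(note_midi) {
  pitch_classes <- c("C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B")
  octave <- note_midi %/% 12 - 1  # convention in which MIDI note 60 is C4
  paste0(pitch_classes[note_midi %% 12 + 1], octave)
}

note_names <- midi_to_canonical(c(60, 62, 64, 65, 67, 61))
```

The first five values come out as C4, D4, E4, F4, and G4, which agrees with the sample rows of “Fugue bwv 846”; note 61 becomes Db4 rather than C#4. Collapsing enharmonic spellings this way keeps a single bar per pitch in the frequency plot.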

tbl_fugue_bwv_846 |>
  group_by(interval) |>
  summarise(n = n()) |>
  arrange(desc(n)) |>
  mutate(
    interval = factor(interval, levels = interval)
  ) |>
  ggplot(mapping = aes(x = interval, y = n)) +
  geom_bar(stat = "identity") +
  labs(
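    # every row repeats the same metadata, so take the first value to build a single title string
    title = str_c(
      "Frequency of Intervals in ",
      first(tbl_fugue_bwv_846$title),
      ", ",
      first(tbl_fugue_bwv_846$key_signature),
      ", ",
      first(tbl_fugue_bwv_846$composer)
    ),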
    x = "Interval",
    y = "Frequency"
  )
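
The interval column describes the step from each note to the one that follows it. The sketch below shows one way such labels could be derived from MIDI note numbers. `interval_name` is a hypothetical helper; treating the interval as unsigned matches the sample rows (G4 down to F4 is labeled M2), but the labels for the tritone and for leaps of an octave or more are assumptions.

```r
# Name the interval between two MIDI notes by their distance in semitones.
interval_name <- function(from_midi, to_midi) {
  names_by_semitone <- c(
    "P1", "m2", "M2", "m3", "M3", "P4", "TT", "P5", "m6", "M6", "m7", "M7"
  )
  semitones <- abs(to_midi - from_midi)  # direction is ignored
  ifelse(
    semitones < 12,
    names_by_semitone[semitones + 1],
    paste0(semitones, "st")  # assumed fallback label for an octave or more
  )
}

# The opening notes of the fugue subject as MIDI numbers: C4 D4 E4 F4 G4 F4.
notes <- c(60, 62, 64, 65, 67, 65)
intervals <- interval_name(head(notes, -1), tail(notes, -1))
```

This yields M2, M2, m2, M2, M2, the same sequence as the interval column in the first five sample rows.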

tbl_fugue_bwv_848 |>
  ggplot(mapping = aes(x = time_offset, y = velocity, color = performer)) +
  geom_line() +
  labs(
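    # label the plot with the piece that is actually plotted (bwv 848), as a single string
    title = str_c(
      "Velocity in ",
      first(tbl_fugue_bwv_848$title),
      ", ",
      first(tbl_fugue_bwv_848$key_signature),
      ", ",
      first(tbl_fugue_bwv_848$composer)
    ),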
    x = "Time",
    y = "Velocity"
  )

tbl_fugue_bwv_848 |>
  ggplot(mapping = aes(x = velocity, color = performer)) +
  geom_histogram(bins = 30) +
  labs(
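    # label the plot with the piece that is actually plotted (bwv 848), as a single string
    title = str_c(
      "Velocity histogram for ",
      first(tbl_fugue_bwv_848$title),
      ", ",
      first(tbl_fugue_bwv_848$key_signature),
      ", ",
      first(tbl_fugue_bwv_848$composer)
    ),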
  ) +
  facet_wrap(~performer)

tbl_fugue_bwv_848 |>
  ggplot(mapping = aes(sample = velocity, color = performer)) +
  # the trailing `+` keeps labs() and facet_wrap() attached to this plot
  geom_line(stat = "qq") +
  labs(
    title = str_c(
      "Velocity QQ plot for ",
      first(tbl_fugue_bwv_848$title),
      ", ",
      first(tbl_fugue_bwv_848$key_signature),
      ", ",
      first(tbl_fugue_bwv_848$composer)
    ),
  ) +
  facet_wrap(~performer)

tbl_fugue_bwv_848 |>
  ggplot(mapping = aes(x = time_duration, color = performer)) +
  geom_histogram(binwidth = 0.025) +
  coord_cartesian(xlim = c(0, 1)) +
  labs(
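    # label the plot with the piece that is actually plotted (bwv 848), as a single string
    title = str_c(
      "Note duration histogram for ",
      first(tbl_fugue_bwv_848$title),
      ", ",
      first(tbl_fugue_bwv_848$key_signature),
      ", ",
      first(tbl_fugue_bwv_848$composer)
    ),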
  ) +
  facet_wrap(~performer)

tbl_fugue_bwv_848 |>
  # compute the variance per performer; ungrouped, var() gives every row the same pooled value
  group_by(performer) |>
  mutate(
    v_velocity = var(velocity)
  ) |>
  ggplot(mapping = aes(x = v_velocity, color = performer)) +
  geom_histogram(binwidth = 0.025) +
  coord_cartesian(xlim = c(0, 1)) +
  labs(
    title = str_c(
      "Note velocity variance histogram for ",
      first(tbl_fugue_bwv_848$title),
      ", ",
      first(tbl_fugue_bwv_848$key_signature),
      ", ",
      first(tbl_fugue_bwv_848$composer)
    ),
  ) +
  facet_wrap(~performer)

tbl_fugue_bwv_848 |>
  group_by(performer, canonical) |>
  summarise(
    count = n()
  ) |>
  ggplot(mapping = aes(y = canonical, x = count, color = performer)) +
    geom_col(position = "dodge")

Possibilities for Analysis

Note Length Timeline

Is there a difference in average note length over time?

tbl_perfs_music |>
  group_by(id) |>
  summarise(
    avg_note_duration = mean(time_duration)
  ) |>
  left_join(tbl_perfs, by = "id") |>
  ggplot(mapping = aes(x = avg_note_duration, y = year_midlife, color = composer)) +
  geom_jitter() +
  labs(
    title = "Average Note Length by Composition",
    x = "Average Note Duration",
    y = "Approximate Year of Composition"
  )

Note Velocity Timeline

Is there a difference in average velocity over time?

tbl_perfs_music |>
  group_by(id) |>
  summarise(
    avg_velocity = mean(velocity)
  ) |>
  left_join(tbl_perfs, by = "id") |>
  ggplot(mapping = aes(x = avg_velocity, y = year_midlife, color = composer)) +
  geom_jitter() +
  labs(
    title = "Average Note Velocity by Composition",
    x = "Average Note Velocity",
    y = "Approximate Year of Composition"
  )

Boxplot

tbl_perfs_music |>
  ggplot(mapping = aes(x = factor(year_written), y = velocity)) +
  geom_boxplot() +
  labs(
    title = "Note Velocity by Year",
    x = "Year Written",
    y = "Note Velocity"
  )

Note Length by Performer

Is there a difference in average note length by performer?

tbl_fugue_bwv_848 |>
  group_by(performer) |>
  summarise(
    avg_note_duration = mean(time_duration)
  ) |>
  ggplot(mapping = aes(x = avg_note_duration, y = performer, color = performer)) +
  geom_col() +
  labs(
    title = "Average Note Duration by Performer",
    x = "Average Note Duration",
    y = "Performer"
  )

Note Velocity Mean by Performer

Is there a difference in average note velocity by performer?

tbl_fugue_bwv_848 |>
  group_by(performer) |>
  summarise(
    avg_velocity = mean(velocity)
  ) |>
  ggplot(mapping = aes(x = avg_velocity, y = performer, color = performer)) +
  geom_col() +
  labs(
    title = "Average Note Velocity by Performer",
    x = "Average Note Velocity",
    y = "Performer"
  )

Note Velocity Variance by Performer

Is there a difference in note velocity variance by performer?

tbl_fugue_bwv_848 |>
  group_by(performer) |>
  summarise(
    v_velocity = var(velocity)
  ) |>
  ggplot(mapping = aes(x = v_velocity, y = performer, color = performer)) +
  geom_col() +
  labs(
    title = "Note Velocity Variance by Performer",
    x = "Note Velocity Variance",
    y = "Performer"
  )
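
A bar chart shows the size of the differences but not whether they exceed what chance would produce. One possible follow-up is a test for equal variances across groups. The sketch below uses `fligner.test()` from base R's stats package on synthetic data; the performers "A", "B", and "C" and their velocities are made up for illustration, and choosing this particular test is an assumption, not something this project has settled on.

```r
set.seed(607)

# Synthetic stand-in for tbl_fugue_bwv_848: same mean velocity, different spread.
toy <- data.frame(
  performer = rep(c("A", "B", "C"), each = 200),
  velocity = c(
    rnorm(200, mean = 0.45, sd = 0.05),
    rnorm(200, mean = 0.45, sd = 0.10),
    rnorm(200, mean = 0.45, sd = 0.15)
  )
)

# Per-performer variance, the quantity plotted in the bar chart.
velocity_var <- tapply(toy$velocity, toy$performer, var)

# Fligner-Killeen test: the null hypothesis is equal variance in every group.
variance_test <- fligner.test(velocity ~ performer, data = toy)
```

A small `variance_test$p.value` would suggest that the performers differ in how widely they vary their touch. With the real data, `toy` would be replaced by `tbl_fugue_bwv_848`; because notes within one performance are not independent observations, the p-value should be read as a rough guide.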

Note Duration Variance by Performer

Is there a difference in note duration variance by performer?

tbl_fugue_bwv_848 |>
  group_by(performer) |>
  summarise(
    v_time_duration = var(time_duration)
  ) |>
  ggplot(mapping = aes(x = v_time_duration, y = performer, color = performer)) +
  geom_col() +
  labs(
    title = "Note Duration Variance by Performer",
    x = "Note Duration Variance",
    y = "Performer"
  )

Note Duration Variability Over Time

Is there a difference in note duration variability over time?
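
One way to approach this question is to reduce each composition to the standard deviation of its note durations and then look for a trend against the approximate year of composition. The sketch below runs on synthetic durations; only the column names (id, year_written, time_duration) come from the schema above, and the choice of standard deviation and Spearman correlation is an assumption.

```r
# Synthetic stand-in for tbl_perfs_music: four compositions, five notes each.
toy_notes <- data.frame(
  id = rep(1:4, each = 5),
  year_written = rep(c(1717.5, 1798.5, 1829.5, 1893.5), each = 5),
  time_duration = c(
    0.25, 0.25, 0.26, 0.24, 0.25,
    0.20, 0.30, 0.25, 0.40, 0.15,
    0.10, 0.50, 0.30, 0.80, 0.20,
    0.05, 0.90, 0.40, 1.20, 0.10
  )
)

# One row per composition: the standard deviation of its note durations.
variability <- aggregate(
  time_duration ~ id + year_written,
  data = toy_notes,
  FUN = sd
)
names(variability)[names(variability) == "time_duration"] <- "sd_note_duration"

# Rank correlation between year and variability; no linear relationship is assumed.
trend <- cor(
  variability$year_written,
  variability$sd_note_duration,
  method = "spearman"
)
```

With the real data, `toy_notes` would be replaced by `tbl_perfs_music` or `tbl_scores_music`. Because year_written is only known per composer, compositions by the same composer share a year, which limits how fine a trend this can reveal.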

Mean Composition Note Length

What is the distribution of the average note length by compositions?

Histogram

tbl_scores_music |>
  group_by(id) |>
  summarise(
    avg_note_duration = mean(time_duration)
  ) |>
  ggplot(mapping = aes(x = avg_note_duration)) +
  geom_histogram(binwidth = 0.025) +
  labs(
    title = "Average Note Duration by Composition",
    x = "Average Note Duration",
    y = "Frequency"
  )

Boxplot

tbl_scores_music |>
  ggplot(mapping = aes(x = factor(year_written), y = time_duration)) +
  geom_boxplot() +
  labs(
    title = "Note Duration by Composition",
    x = "Year Written",
    y = "Note Duration"
  )

Note value over Time

Has it changed?

Boxplot

tbl_scores_music |>
  ggplot(mapping = aes(x = factor(year_written), y = note_normal)) +
  geom_boxplot() +
  labs(
    title = "Note Value by Year",
    x = "Year Written",
    y = "Note Value"
  )